(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GCCAGGAATA ACTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT CCTCTTAGTT 60 

CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC TGAATAATAA TGGCTTTGAA 120 

GATATTGTCA TTGTTATAGA TCCTAGTGTG CCAGAAGATG AAAAAATAAT TGAACAAATA 180 

GAGGATATGG TGACTACAGC TTCTACGTAC CTGTTTGAAG CCACAGAAAA AAGATTTTTT 240

T 241 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

CTAGAGAGGA ACAATGGGGT TATTCAGAGG TTTTGTTTTC CTCTTAGTTC TGTGCCTGCT 60 

GCACCAGTCA AATACTTCCT TCATTAAGCT GAATAATAAT GGCTTTGAAG ATATTGTCAT 120

TGTTATAGAT CCTAGTGTGC CAGAAGATGA AAAAATAATT GAACAAATAG AGGATATGGT 180

GACTACAGCT TCTACGTACC TGTTTGAAGC CACAGAAAA 219 

(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 231 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 3

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 

TTNTGTAACG AAAAAACCCA TAATCAAGAA GCTCCAAGCC TACAAAACAT AAAGTGCAAT 60 

TTTAGAAGTA CATGGGAGGT GATTAGCAAT TCTGAGGATT TTAAAAACAC CATACCCATG 120 

GTGACACCAC CTCCTCCACC TGTCTTCTCA TTGCTGAAGA TCAGTCAAAG AATTGTGTGC 180 

TTAGTTCTTG ATAAGTCTGG AAGCATGGGG GGTAAGGACC GCCTAAATCG A 231

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 237 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 
TGGGGGGTAA GGACCGCCTA AATCGAATGA ATCAAGCAGC AAAACATTTC CTGCTGCAGA
CTGTTGAAAA TGGATCCTGG GTGGGGATGG TTCACTTTGA TAGTACTGCC ACTATTGTAA 
ATAAGCTAAT CCAAATAAAA AGCAGTGATG AAAGAAACAC ACTCATGGCA GGATTACCTA 
CATATCCTCT GGGAGGAACT TCCATCTGCT CTGGAATTAA ATATGCATTT CAGGTGA 



(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 216 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 

CTTCCATCTG CTCTGGAATT AAATATGCAT TTCAGGTGAT TGGAGAGCTA CATTCCCAAC 
TCGATGGATC CGAAGTACTG CTGCTGACTG ATGGGGAGGA TAACACTGCA AGTTCTTGTA 
TTGATGAAGT GAAACAAAGT GGGGCCATTG TTCATTTTAT TGCTTTGGGA AGAGCTGCTG 
ATGAAGCAGT AATAGAGATG AGCAAGATAA CAGGAG 

(2) INFORMATION FOR SEQ ID NO : 6 : 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 24

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 



AATTGATAGT ACAGTGGGAA AGGNCACGTT CTTTCTCATC ACATGGAACA GTCTGCCTCC 60 

CAGTATTTCT CTCTGGGATC CCAGTGGAAC AATAATGGAA AATTTCACAG TGGATGCAAC 120

TTCCAAAATG GCCTATCTCA GTATTCCAGG AACTGCAAAG GTGGGCACTT GGGCATACAA 180 

TCTTCAAGCC AAAGCGAACC C 201 

(2) INFORMATION FOR SEQ ID NO: 7: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 241 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCAAATTCTT CTGTGCCTCC AATCACAGTG AATGCTAAAA TGAATAAGGA CGTAAACAGT 60 

TTCCCCAGCC CAATGATTGT TTACGCAGAA ATTCTACAAG GATATGTACC TGTTCTTGGA 120 

GCCAATGTGA CTGCTTTCAT TGAATCACAG AATGGACATA CAGAAGTTTT GGAACTTTTG 180 

GATAATGGTG CAGGCGCTGA TTCTTTCAAG AATGATGGAG TCTACTCCAG GTATTTTACA 240

G 241 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 242 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

iiE EsEsEs -™ *— ssssss £ 

j«n™S ssss SSSSK sssss JESSE S5SS a 1 , 6 ? 



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 9: 

SS SS S™ SSSF TTGGAGGAT * TCAGCCGAAC 
CCCACCAAGT CAAATCACAG IcC^GlTCC CTTCC ^TGC CTGACCAATA 

ATGGACAGCA CCAGgSSS SSS5SS SSSSSS S™™* 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 313 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 22

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 44

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

otSSS SSSSSS SSS2IS I gatgccaca sttnatgagg ataagattat 

AAGAATAAGT SSSK ^ GTTCAAC GTTATATCAT 

TACTACTGAT CTGTCACCAA SSSSJSS ^nnaSF* GATGATG CTC TTCAAGTAAA 
AAATATCTCA GA^SSSS JSSSSJ SSiSSS? SSSSSI TTAAACCAGA 
ATTTGGCATC AAA ATTTATTGCC ATTAAAAGTA TAGATAAAGC 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 242 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

AAGTATTCTT GATCTAAGAG ACAGTTTTGA TGATGCTCTT CAAGTAAATA CTACTGATCT 60

GTCACCAAAG GAGGCCAACT CCAAGGAAAG CTTTGCATTT AAACCAGAAA ATATCTCAGA 120
AGAAAATGCA ACCCACATAT TTATTGCCAT TAAAAGTATA GATAAAAGCA ATTTGACATC 180
AAAAGTATCC AACATTGCAC AAGTAACTTT GTTTATCCCT CAAGCAAATC CTGATGACAT 240 

242 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 2

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ANANAATGCA ACCCACATAT TTATTGCCAT TAAAAGTATA GATAAAAGCA ATTTGACATC 60 

AAAAGTATCC AACATTGCAC AAGTAACTTT GTTTATCCCT CAAGCAAATC CTGATGACAT 120 

TGATCCTACT CCTACTCCTA CTCCTACTCC TGATAAAAGT CATAATTCTG GAGTTAATAT 180 

TTCTACGCTG GTATTGTCTG TGATTGGG 208

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CTCCTACTCC TACTCCTGAT AAAAGTCATA ATTCTGGAGT TAATATTTCT ACGCTGGTAT 60 

TGTCTGTGAT TGGGTCTGTT GTAATTGTTA ACTTTATTTT AAGTACCACC ATTTGAACCT 120 

TAACGAAGAA AAAAATCTTC AAGTAGACCT AGAAGAGAGT TTTAAAAAAC AAAACAATGT 180 

AAGTAAAGGA TATTTCTGAA T 201

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 111

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 244

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(ix) FEATURE:

(A) NAME/KEY: base_polymorphism

(B) LOCATION: 284

(D) OTHER INFORMATION: /note= "'N' represents an A or G or
T or C polymorphism at this position"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



TCTGTTGTAA TTGTTAACTT TATTTTAAGT ACCACCATTT GAACCTTAAC GAAGAAAAAA 60
ATCTTCAAGT AGACCTAGAA GAGAGTTTTA AAAAACAAAA CAATGTAAGT NAAGGATATT 120
TCTGAATCTT AAAATTCATC CCATGTGTGA TCATAAACTC ATAAAAATAA TTTTAAGATG 180
TCGGAAAAGG ATACTTTGAT TAAATAAAAA CACTCATGGA TATGTAAAAA CTGTCAAGAT 240
TAANATTTAA TAGTTTCATT TATTTGTTAT TTTATTTGTA AGANATAGTG ATGAACAAAG 300
A 301



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 229 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 15: 



G^^li SSJSiX £™£^ G AGTTTTAA ** AACAAAACAA TGTAAGTAAA 
^TATTTCT GAATCTTAAA ATTCATCCCA TGTGTGATCA TAAACTCATA AAAATAATTT 
TAAGATGTCG GAAAAGGATA CTTTGATTAA ATAAAAACAC TCATGGATAT GTAaSIS 
TCAAGATTAA AATTTAATAG TTTCATTTAT TTGTTATTTT ATTTGTAAG GTAAAAACTG 



60 
120 
180 
229 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3043 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CTAGAGAGGA ACAATGGGGT TATTCAGAGG TTTTGTTTTC CTCTTAGTTC TGTGCCTrrr 
^^ AGTCA AATACTTCCT TCATTAAGCT GAATAATAAT GGCTTTgSg ItATTGTCAT 
TGTTATAGAT CCTAGTGTGC CAGAAGATGA AAAAATAATT GAACAAATAG iSSSSJ 

SSJSiSS EES^J cacagaaaaI 5SSESSS 

=2 35SSSS SS ESS S3SSSX SSiSSSS „ . 
gSSS? caaaggcgaJ SSSSS SSSSS S 

GtSgSSS ii^^r ^™J A TGGACCA CCA GGCAAACTGT TTGTCCATGA 4 80 

UXLr^GCTCAC CTCCGGTGGG GAGTGTTTGA TGAGTACAAT GAAGATCAGC CTTTPT A ccc Iah 

ESSSE JJSE** 5 gtgtt ^cI S5S2SSS IS 

WVAaSSa*? ^^ G "2I CT TAGT AGAGCA TGCAGAATTG ATTCTACAAC 660 

AAAACTGTAT GGAAAAGATT GTCAATTCTT TCCTGATAAA GTACAAACAG AAAAACJPATr n on 

?SSSSSS ?S2^E tgttgaat " 5SK25SS J^c™ 7 7 

T^5?SJ?S GAGGATTTTA nt^^™ GTGCAATTTT AGAAGTACAT GGGAGGTGAT 840 

TAGCAATTCT GAGGATTTTA AAAACACCAT ACCCATGGTG ACACCACCTC CTCCACCTGT 900



60 
120 
180 
240 
300 
CTTCTCATTG CTGAAGATCA GTCAAAGAAT TGTGTGCTTA GTTCTTGATA AGTCTGGAAG 960
CATGGGGGGT AAGGACCGCC TAAATCGAAT GAATCAAGCA GCAAAACATT TCCTGCTGCA 1020
GACTGTTGAA AATGGATCCT GGGTGGGGAT GGTTCACTTT GATAGTACTG CCACTATTGT 1080
AAATAAGCTA ATCCAAATAA AAAGCAGTGA TGAAAGAAAC ACACTCATGG CAGGATTACC 1140
TACATATCCT CTGGGAGGAA CTTCCATCTG CTCTGGAATT AAATATGCAT TTCAGGTGAT 1200
TGGAGAGCTA CATTCCCAAC TCGATGGATC CGAAGTACTG CTGCTGACTG ATGGGGAGGA 1260
TAACACTGCA AGTTCTTGTA TTGATGAAGT GAAACAAAGT GGGGCCATTG TTCATTTTAT 1320
TGCTTTGGGA AGAGCTGCTG ATGAAGCAGT AATAGAGATG AGCAAGATAA CAGGAGGAAG 1380
TCATTTTTAT GTTTCAGATG AAGCTCAGAA CAATGGCCTC ATTGATGCTT TTGGGGCTCT 1440
TACATCAGGA AATACTGATC TCTCCCAGAA GTCCCTTCAG CTCGAAAGTA AGGGATTAAC 1500
ACTGAATAGT AATGCCTGGA TGAACGACAC TGTCATAATT GATAGTACAG TGGGAAAGGA 1560
CACGTTCTTT CTCATCACAT GGAACAGTCT GCCTCCCAGT ATTTCTCTCT GGGATCCCAG 1620
TGGAACAATA ATGGAAAATT TCACAGTGGA TGCAACTTCC AAAATGGCCT ATCTCAGTAT 1680
TCCAGGAACT GCAAAGGTGG GCACTTGGGC ATACAATCTT CAAGCCAAAG CGAACCCAGA 1740
AACATTAACT ATTACAGTAA CTTCTCGAGC AGCAAATTCT TCTGTGCCTC CAATCACAGT 1800
GAATGCTAAA ATGAATAAGG ACGTAAACAG TTTCCCCAGC CCAATGATTG TTTACGCAGA 1860
AATTCTACAA GGATATGTAC CTGTTCTTGG AGCCAATGTG ACTGCTTTCA TTGAATCACA 1920
GAATGGACAT ACAGAAGTTT TGGAACTTTT GGATAATGGT GCAGGCGCTG ATTCTTTCAA 1980
GAATGATGGA GTCTACTCCA GGTATTTTAC AGCATATACA GAAAATGGCA GATATAGCTT 2040
AAAAGTTCGG GCTCATGGAG GAGCAAACAC TGCCAGGCTA AAATTACGGC CTCCACTGAA 2100
TAGAGCCGCG TACATACCAG GCTGGGTAGT GAACGGGGAA ATTGAAGCAA ACCCGCCAAG 2160
ACCTGAAATT GATGAGGATA CTCAGACCAC CTTGGAGGAT TTCAGCCGAA CAGCATCCGG 2220
AGGTGCATTT GTGGTATCAC AAGTCCCAAG CCTTCCCTTG CCTGACCAAT ACCCACCAAG 2280
TCAAATCACA GACCTTGATG CCACAGTTCA TGAGGATAAG ATTATTCTTA CATGGACAGC 2340
ACCAGGAGAT AATTTTGATG TTGGAAAAGT TCAACGTTAT ATCATAAGAA TAAGTGCAAG 2400
TATTCTTGAT CTAAGAGACA GTTTTGATGA TGCTCTTCAA GTAAATACTA CTGATCTGTC 2460
ACCAAAGGAG GCCAACTCCA AGGAAAGCTT TGCATTTAAA CCAGAAAATA TCTCAGAAGA 2520
AAATGCAACC CACATATTTA TTGCCATTAA AAGTATAGAT AAAAGCAATT TGACATCAAA 2580
AGTATCCAAC ATTGCACAAG TAACTTTGTT TATCCCTCAA GCAAATCCTG ATGACATTGA 2640
TCCTACTCCT ACTCCTACTC CTACTCCTGA TAAAAGTCAT AATTCTGGAG TTAATATTTC 2700
TACGCTGGTA TTGTCTGTGA TTGGGTCTGT TGTAATTGTT AACTTTATTT TAAGTACCAC 2760
CATTTGAACC TTAACGAAGA AAAAAATCTT CAAGTAGACC TAGAAGAGAG TTTTAAAAAA 2820
CAAAACAATG TAAGTAAAGG ATATTTCTGA ATCTTAAAAT TCATCCCATG TGTGATCATA 2880
AACTCATAAA AATAATTTTA AGATGTCGGA AAAGGATACT TTGATTAAAT AAAAACACTC 2940
ATGGATATGT AAAAACTGTC AAGATTAAAA TTTAATAGTT TCATTTATTT GTTATTTTAT 3000
TTGTAAGAAA TAGTGATGAA CAAAGATCCT TTTTCATACT GAT 3043



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1399 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GCAAATTCTT CTGTGCCTCC AATCACAGTG AATGCTAAAA TGAATAAGGA CGTAAACAGT 60
TTCCCCAGCC CAATGATTGT TTACGCAGAA ATTCTACAAG GATATGTACC TGTTCTTGGA 120
GCCAATGTGA CTGCTTTCAT TGAATCACAG AATGGACATA CAGAAGTTTT GGAACTTTTG 180
GATAATGGTG CAGGCGCTGA TTCTTTCAAG AATGATGGAG TCTACTCCAG GTATTTTACA 240
GCATATACAG AAAATGGCAG ATATAGCTTA AAAGTTCGGG CTCATGGAGG AGCAAACACT 300
GCCAGGCTAA AATTACGGCC TCCACTGAAT AGAGCCGCGT ACATACCAGG CTGGGTAGTG 360
AACGGGGAAA TTGAAGCAAA CCCGCCAAGA CCTGAAATTG ATGAGGATAC TCAGACCACC 420
TTGGAGGATT TCAGCCGAAC AGCATCCGGA GGTGCATTTG TGGTATCACA AGTCCCAAGC 480
CTTCCCTTGC CTGACCAATA CCCACCAAGT CAAATCACAG ACCTTGATGC CACAGTTCAT 540
GAGGATAAGA TTATTCTTAC ATGGACAGCA CCAGGAGATA ATTTTGATGT TGGAAAAGTT 600
CAACGTTATA TCATAAGAAT AAGTGCAAGT ATTCTTGATC TAAGAGACAG TTTTGATGAT 660
GCTCTTCAAG TAAATACTAC TGATCTGTCA CCAAAGGAGG CCAACTCCAA GGAAAGCTTT 720
GCATTTAAAC CAGAAAATAT CTCAGAAGAA AATGCAACCC ACATATTTAT TGCCATTAAA 780
AGTATAGATA AAAGCAATTT GACATCAAAA GTATCCAACA TTGCACAAGT AACTTTGTTT 840
ATCCCTCAAG CAAATCCTGA TGACATTGAT CCTACTCCTA CTCCTACTCC TACTCCTGAT 900
AAAAGTCATA ATTCTGGAGT TAATATTTCT ACGCTGGTAT TGTCTGTGAT TGGGTCTGTT 960
GTAATTGTTA ACTTTATTTT AAGTACCACC ATTTGAACCT TAACGAAGAA AAAAATCTTC 1020
AAGTAGACCT AGAAGAGAGT TTTAAAAAAC AAAACAATGT AAGTAAAGGA TATTTCTGAA 1080
TCTTAAAATT CATCCCATGT GTGATCATAA ACTCATAAAA ATAATTTTAA GATGTCGGAA 1140
AAGGATACTT TGATTAAATA AAAACACTCA TGGATATGTA AAAACTGTCA AGATTAAAAT 1200
TTAATAGTTT CATTTATTTG TTATTTTATT TGTAAGAAAT AGTGATGAAC AAAGATCCTT 1260
TTTCATACTG ATACCTGGTT GTATATTATT TGATGCAACA GTTTTCTGAA ATGATATTTC 1320
AAATTGCATC AAGAAATTAA AATCATCTAT CTGAGTAGTC AAAATACAAG TAAAGGAGAG 1380
CAAATAAACA ACATTTGGA 1399



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 3181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



GCCAGGAATA ACTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT CCTCTTAGTT 60
CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC TGAATAATAA TGGCTTTGAA 120
GATATTGTCA TTGTTATAGA TCCTAGTGTG CCAGAAGATG AAAAAATAAT TGAACAAATA 180
GAGGATATGG TGACTACAGC TTCTACGTAC CTGTTTGAAG CCACAGAAAA AAGATTTTTT 240
TTCAAAAATG TATCTATATT AATTCCTGAG AATTGGAAGG AAAATCCTCA GTACAAAAGG 300
CCAAAACATG AAAACCATAA ACATGCTGAT GTTATAGTTG CACCACCTAC ACTCCCAGGT 360
AGAGATGAAC CATACACCAA GCAGTTCACA GAATGTGGAG AGAAAGGCGA ATACATTCAC 420
TTCACCCCTG ACCTTCTACT TGAAAAAAAA CAAAATGAAT ATGGACCACC AGGCAAACTG 480
TTTGTCCATG AGTGGGCTCA CCTCCGGTGG GGAGTGTTTG ATGAGTACAA TGAAGATCAG 540
CCTTTCTACC GTGCTAAGTC AAAAAAAATC GAAGCAACAA GGTGTTCCGC AGGTATCTCT 600
GGTAGAAATA GAGTTTATAA GTGTCAAGGA GGCAGCTGTC TTAGTAGAGC ATGCAGAATT 660
GATTCTACAA CAAAACTGTA TGGAAAAGAT TGTCAATTCT TTCCTGATAA AGTACAAACA 720
GAAAAAGCAT CCATAATGTT TATGCAAAGT ATTGATTCTG TTGTTGAATT TTGTAACGAA 780
AAAACCCATA ATCAAGAAGC TCCAAGCCTA CAAAACATAA AGTGCAATTT TAGAAGTACA 840
TGGGAGGTGA TTAGCAATTC TGAGGATTTT AAAAACACCA TACCCATGGT GACACCACCT 900
CCTCCACCTG TCTTCTCATT GCTGAAGATC AGTCAAAGAA TTGTGTGCTT AGTTCTTGAT 960
AAGTCTGGAA GCATGGGGGG TAAGGACCGC CTAAATCGAA TGAATCAAGC AGCAAAACAT 1020
TTCCTGCTGC AGACTGTTGA AAATGGATCC TGGGTGGGGA TGGTTCACTT TGATAGTACT 1080
GCCACTATTG TAAATAAGCT AATCCAAATA AAAAGCAGTG ATGAAAGAAA CACACTCATG 1140
GCAGGATTAC CTACATATCC TCTGGGAGGA ACTTCCATCT GCTCTGGAAT TAAATATGCA 1200
TTTCAGGTGA TTGGAGAGCT ACATTCCCAA CTCGATGGAT CCGAAGTACT GCTGCTGACT 1260
GATGGGGAGG ATAACACTGC AAGTTCTTGT ATTGATGAAG TGAAACAAAG TGGGGCCATT 1320
GTTCATTTTA TTGCTTTGGG AAGAGCTGCT GATGAAGCAG TAATAGAGAT GAGCAAGATA 1380
ACAGGAGGAA GTCATTTTTA TGTTTCAGAT GAAGCTCAGA ACAATGGCCT CATTGATGCT 1440
TTTGGGGCTC TTACATCAGG AAATACTGAT CTCTCCCAGA AGTCCCTTCA GCTCGAAAGT 1500
AAGGGATTAA CACTGAATAG TAATGCCTGG ATGAACGACA CTGTCATAAT TGATAGTACA 1560
GTGGGAAAGG ACACGTTCTT TCTCATCACA TGGAACAGTC TGCCTCCCAG TATTTCTCTC 1620
TGGGATCCCA GTGGAACAAT AATGGAAAAT TTCACAGTGG ATGCAACTTC CAAAATGGCC 1680
TATCTCAGTA TTCCAGGAAC TGCAAAGGTG GGCACTTGGG CATACAATCT TCAAGCCAAA 1740
GCGAACCCAG AAACATTAAC TATTACAGTA ACTTCTCGAG CAGCAAATTC TTCTGTGCCT 1800
CCAATCACAG TGAATGCTAA AATGAATAAG GACGTAAACA GTTTCCCCAG CCCAATGATT 1860
GTTTACGCAG AAATTCTACA AGGATATGTA CCTGTTCTTG GAGCCAATGT GACTGCTTTC 1920
ATTGAATCAC AGAATGGACA TACAGAAGTT TTGGAACTTT TGGATAATGG TGCAGGCGCT 1980
GATTCTTTCA AGAATGATGG AGTCTACTCC AGGTATTTTA CAGCATATAC AGAAAATGGC 2040
AGATATAGCT TAAAAGTTCG GGCTCATGGA GGAGCAAACA CTGCCAGGCT AAAATTACGG 2100
CCTCCACTGA ATAGAGCCGC GTACATACCA GGCTGGGTAG TGAACGGGGA AATTGAAGCA 2160
AACCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA CCTTGGAGGA TTTCAGCCGA 2220
ACAGCATCCG GAGGTGCATT TGTGGTATCA CAAGTCCCAA GCCTTCCCTT GCCTGACCAA 2280
TACCCACCAA GTCAAATCAC AGACCTTGAT GCCACAGTTC ATGAGGATAA GATTATTCTT 2340
ACATGGACAG CACCAGGAGA TAATTTTGAT GTTGGAAAAG TTCAACGTTA TATCATAAGA 2400
ATAAGTGCAA GTATTCTTGA TCTAAGAGAC AGTTTTGATG ATGCTCTTCA AGTAAATACT 2460
ACTGATCTGT CACCAAAGGA GGCCAACTCC AAGGAAAGCT TTGCATTTAA ACCAGAAAAT 2520
ATCTCAGAAG AAAATGCAAC CCACATATTT ATTGCCATTA AAAGTATAGA TAAAAGCAAT 2580
TTGACATCAA AAGTATCCAA CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT 2640
GATGACATTG ATCCTACTCC TACTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA 2700
GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGGTCTG TTGTAATTGT TAACTTTATT 2760
TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT TCAAGTAGAC CTAGAAGAGA 2820
GTTTTAAAAA ACAAAACAAT GTAAGTAAAG GATATTTCTG AATCTTAAAA TTCATCCCAT 2880
GTGTGATCAT AAACTCATAA AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA 2940
TAAAAACACT CATGGATATG TAAAAACTGT CAAGATTAAA ATTTAATAGT TTCATTTATT 3000
TGTTATTTTA TTTGTAAGAA ATAGTGATGA ACAAAGATCC TTTTTCATAC TGATACCTGG 3060
TTGTATATTA TTTGATGCAA CAGTTTTCTG AAATGATATT TCAAATTGCA TCAAGAAATT 3120
AAAATCATCT ATCTGAGTAG TCAAAATACA AGTAAAGGAG AGCAAATAAA CAACATTTGG 3180
A 3181



(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

AGCTCGGAAT TCCGAGCTTG GATCCTCTAG AGCGGCCGCC GACTAGTGAG CTCGTCGACC 
CGGGAATT 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AATTAATTCC CGGGTCGACG AGCTCACTAG TCGGCGGCCG CTCTAGAGGA TCCAAGCTCG 
GAATTCCG 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGCGGATAAC AATTTCACAC AGGA 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

TGTAAAACGA CGGCCAGT 18 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CTGCCAGGCT AAAATTACGG 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
ATCACAGACC TTGATGCCAC 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
GCTGGTATTG TCTGTGATTG GGTC 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
CATCAGGATT TGCTTGAGGG 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TATTGGTCAG GCAAGGGAAG 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
GTGTTTGCTC CTCCATGAGC 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CAAGTAGAAG GTCAGGGGTG 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
ATAAGTGTCA AGGAGGCAGC 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCAGACTGTT CCATGTGATG 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATGTACCTGT TCTTGGAGCC 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
ACGTACCTGT TTGAAGCCAC 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
GGTAAGGACC GCCTAAATCG 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GAAGTGAAAC AAAGTGGGGC 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
TTATCCTCCC CATCAGTCAG 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
TCGATTTAGG CGGTCCTTAC 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
TGTGGCTTCA AACAGGTACG 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GGGTAAGGAC CGCCTAAATC GAATG 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GAGCCCCAAA AGCATCAATG AGG 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 917 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 



Met Gly Leu Phe Arg Gly Phe Val Phe Leu Leu Val Leu Cys Leu Leu
1 5 10 15

His Gln Ser Asn Thr Ser Phe Ile Lys Leu Asn Asn Asn Gly Phe Glu
20 25 30

Asp Ile Val Ile Val Ile Asp Pro Ser Val Pro Glu Asp Glu Lys Ile
35 40 45

Ile Glu Gln Ile Glu Asp Met Val Thr Thr Ala Ser Thr Tyr Leu Phe
50 55 60

Glu Ala Thr Glu Lys Arg Phe Phe Phe Lys Asn Val Ser Ile Leu Ile
65 70 75 80

Pro Glu Asn Trp Lys Glu Asn Pro Gln Tyr Lys Arg Pro Lys His Glu
85 90 95

Asn His Lys His Ala Asp Val Ile Val Ala Pro Pro Thr Leu Pro Gly
100 105 110

Arg Asp Glu Pro Tyr Thr Lys Gln Phe Thr Glu Cys Gly Glu Lys Gly
115 120 125

Glu Tyr Ile His Phe Thr Pro Asp Leu Leu Leu Glu Lys Lys Gln Asn
130 135 140

Glu Tyr Gly Pro Pro Gly Lys Leu Phe Val His Glu Trp Ala His Leu
145 150 155 160

Arg Trp Gly Val Phe Asp Glu Tyr Asn Glu Asp Gln Pro Phe Tyr Arg
165 170 175

Ala Lys Ser Lys Lys Ile Glu Ala Thr Arg Cys Ser Ala Gly Ile Ser
180 185 190

Gly Arg Asn Arg Val Tyr Lys Cys Gln Gly Gly Ser Cys Leu Ser Arg
195 200 205

Ala Cys Arg Ile Asp Ser Thr Thr Lys Leu Tyr Gly Lys Asp Cys Gln
210 215 220

Phe Phe Pro Asp Lys Val Gln Thr Glu Lys Ala Ser Ile Met Phe Met
225 230 235 240

Gln Ser Ile Asp Ser Val Val Glu Phe Cys Asn Glu Lys Thr His Asn
245 250 255

Gln Glu Ala Pro Ser Leu Gln Asn Ile Lys Cys Asn Phe Arg Ser Thr
260 265 270

Trp Glu Val Ile Ser Asn Ser Glu Asp Phe Lys Asn Thr Ile Pro Met
275 280 285

Val Thr Pro Pro Pro Pro Pro Val Phe Ser Leu Leu Lys Ile Ser Gln
290 295 300

Arg Ile Val Cys Leu Val Leu Asp Lys Ser Gly Ser Met Gly Gly Lys
305 310 315 320

Asp Arg Leu Asn Arg Met Asn Gln Ala Ala Lys His Phe Leu Leu Gln
325 330 335

Thr Val Glu Asn Gly Ser Trp Val Gly Met Val His Phe Asp Ser Thr
340 345 350

Ala Thr Ile Val Asn Lys Leu Ile Gln Ile Lys Ser Ser Asp Glu Arg
355 360 365

Asn Thr Leu Met Ala Gly Leu Pro Thr Tyr Pro Leu Gly Gly Thr Ser
370 375 380

Ile Cys Ser Gly Ile Lys Tyr Ala Phe Gln Val Ile Gly Glu Leu His
385 390 395 400

Ser Gln Leu Asp Gly Ser Glu Val Leu Leu Leu Thr Asp Gly Glu Asp
405 410 415

Asn Thr Ala Ser Ser Cys Ile Asp Glu Val Lys Gln Ser Gly Ala Ile
420 425 430

Val His Phe Ile Ala Leu Gly Arg Ala Ala Asp Glu Ala Val Ile Glu
435 440 445

Met Ser Lys Ile Thr Gly Gly Ser His Phe Tyr Val Ser Asp Glu Ala
450 455 460

Gln Asn Asn Gly Leu Ile Asp Ala Phe Gly Ala Leu Thr Ser Gly Asn
465 470 475 480

Thr Asp Leu Ser Gln Lys Ser Leu Gln Leu Glu Ser Lys Gly Leu Thr
485 490 495

Leu Asn Ser Asn Ala Trp Met Asn Asp Thr Val Ile Ile Asp Ser Thr
500 505 510

Val Gly Lys Asp Thr Phe Phe Leu Ile Thr Trp Asn Ser Leu Pro Pro
515 520 525

Ser Ile Ser Leu Trp Asp Pro Ser Gly Thr Ile Met Glu Asn Phe Thr
530 535 540

Val Asp Ala Thr Ser Lys Met Ala Tyr Leu Ser Ile Pro Gly Thr Ala
545 550 555 560

Lys Val Gly Thr Trp Ala Tyr Asn Leu Gln Ala Lys Ala Asn Pro Glu
565 570 575

Thr Leu Thr Ile Thr Val Thr Ser Arg Ala Ala Asn Ser Ser Val Pro
580 585 590

Pro Ile Thr Val Asn Ala Lys Met Asn Lys Asp Val Asn Ser Phe Pro
595 600 605

Ser Pro Met Ile Val Tyr Ala Glu Ile Leu Gln Gly Tyr Val Pro Val
610 615 620

Leu Gly Ala Asn Val Thr Ala Phe Ile Glu Ser Gln Asn Gly His Thr
625 630 635 640

Glu Val Leu Glu Leu Leu Asp Asn Gly Ala Gly Ala Asp Ser Phe Lys
645 650 655

Asn Asp Gly Val Tyr Ser Arg Tyr Phe Thr Ala Tyr Thr Glu Asn Gly
660 665 670

Arg Tyr Ser Leu Lys Val Arg Ala His Gly Gly Ala Asn Thr Ala Arg
675 680 685

Leu Lys Leu Arg Pro Pro Leu Asn Arg Ala Ala Tyr Ile Pro Gly Trp
690 695 700

Val Val Asn Gly Glu Ile Glu Ala Asn Pro Pro Arg Pro Glu Ile Asp
705 710 715 720

Glu Asp Thr Gln Thr Thr Leu Glu Asp Phe Ser Arg Thr Ala Ser Gly
725 730 735

Gly Ala Phe Val Val Ser Gln Val Pro Ser Leu Pro Leu Pro Asp Gln
740 745 750

Tyr Pro Pro Ser Gln Ile Thr Asp Leu Asp Ala Thr Val His Glu Asp
755 760 765

Lys Ile Ile Leu Thr Trp Thr Ala Pro Gly Asp Asn Phe Asp Val Gly
770 775 780

Lys Val Gln Arg Tyr Ile Ile Arg Ile Ser Ala Ser Ile Leu Asp Leu
785 790 795 800

Arg Asp Ser Phe Asp Asp Ala Leu Gln Val Asn Thr Thr Asp Leu Ser
805 810 815

Pro Lys Glu Ala Asn Ser Lys Glu Ser Phe Ala Phe Lys Pro Glu Asn
820 825 830

Ile Ser Glu Glu Asn Ala Thr His Ile Phe Ile Ala Ile Lys Ser Ile
835 840 845

Asp Lys Ser Asn Leu Thr Ser Lys Val Ser Asn Ile Ala Gln Val Thr
850 855 860

Leu Phe Ile Pro Gln Ala Asn Pro Asp Asp Ile Asp Pro Thr Pro Thr
865 870 875 880

Pro Thr Pro Thr Pro Asp Lys Ser His Asn Ser Gly Val Asn Ile Ser
885 890 895

Thr Leu Val Leu Ser Val Ile Gly Ser Val Val Ile Val Asn Phe Ile
900 905 910

Leu Ser Thr Thr Ile
915

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Ala Asn Ser Ser Val Pro Pro Ile Thr Val Asn Ala Lys Met Asn Lys

1 5 10 15

Asp Val Asn Ser Phe 
20 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

Asp Asn Gly Ala Gly Ala Asp Ser Phe Lys Asn Asp Gly Val Tyr Ser

1 5 10 15

Arg Tyr Phe Thr Ala Tyr Thr Glu Asn Gly Arg Tyr Ser Leu Lys
20 25 30

(2) INFORMATION FOR SEQ ID NO: 44: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Val Arg Ala His Gly Gly Ala Asn Thr Ala Arg Leu Lys Leu Arg Pro 

1 5 10 15

Pro Leu Asn Arg Ala Ala Tyr Ile
20 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Ser Leu Pro Leu Pro Asp Gln Tyr Pro Pro Ser Gln Ile Thr Asp Leu

1 5 10 15

Asp Ala Thr Val His Glu Asp Lys Ile Ile Leu Thr Trp Thr Ala Pro

20 25 30 

Gly Asp Asn Phe Asp Val Gly Lys 
35 40 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 

Tyr Asn Glu Asp Gln Pro Phe Tyr Arg Ala Lys Ser Lys Lys Ile Glu

1 5 10 15 

Ala Thr Arg Cys 

20 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

Leu Ser Arg Ala Cys Arg Ile Asp Ser Thr Thr Lys Leu Tyr Gly Lys

1 5 10 15

Asp Cys Gln Phe Phe Pro Asp Lys
20 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

Lys Ser Ser Asp Glu Arg Asn Thr Leu Met Ala Gly Leu Pro Thr Tyr 

1 5 10 15 

Pro Leu Gly Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

Glu Ile Asp Glu Asp Thr Gln Thr Thr Leu Glu Asp Phe Ser Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: 

Asp Tyr Lys Asp Asp Asp Asp Lys 
1 5 

(2) INFORMATION FOR SEQ ID NO:51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Met His Thr Glu His

1 5 10 15

His His His His His
20

We Claim:

1. A purified polynucleotide or fragment thereof derived from a CS193
gene, wherein said polynucleotide is capable of selectively hybridizing to the nucleic
acid of said CS193 gene and has at least 50% identity to a sequence selected from the
group consisting of SEQUENCE ID NOS 1-18, and fragments or complements
thereof.

2. The purified polynucleotide of claim 1, wherein said polynucleotide is
produced by recombinant techniques.

3. The purified polynucleotide of claim 1, wherein said polynucleotide is
produced by synthetic techniques.

4. The purified polynucleotide of claim 1, wherein said polynucleotide
comprises a sequence encoding at least one CS193 epitope.

5. A recombinant expression system comprising a nucleic acid sequence
that includes an open reading frame derived from CS193 operably linked to a control
sequence compatible with a desired host, wherein said nucleic acid sequence has at least
50% identity to a sequence selected from the group consisting of SEQUENCE ID NOS
1-18 and fragments or complements thereof.

6. A cell transfected with the recombinant expression system of claim 5.

7. A CS193 polypeptide having at least 60% identity with an amino acid
sequence selected from the group consisting of SEQUENCE ID NOS 41-49, and
fragments thereof.

8. The polypeptide of claim 7, wherein said polypeptide is produced by 
recombinant techniques. 

9. The polypeptide of claim 7, wherein said polypeptide is produced by
synthetic techniques.

10. An antibody which specifically binds to at least one CS193 epitope,
wherein said CS193 epitope is derived from an amino acid sequence having at least
50% identity to an amino acid sequence selected from the group consisting of
SEQUENCE ID NOS 41-49, and fragments thereof.

11. A cell transfected with a nucleic acid sequence encoding at least one
CS193 epitope, wherein said nucleic acid sequence is selected from the group
consisting of SEQUENCE ID NOS 1-18, and fragments or complements thereof.

12. A method for producing a polypeptide comprising at least one CS193
epitope, said method comprising incubating host cells that have been transfected with an
expression vector containing a polynucleotide sequence encoding a polypeptide,
wherein said polypeptide comprises an amino acid sequence having at least 60%
identity with an amino acid sequence selected from the group consisting of
SEQUENCE ID NOS 41-49, and fragments thereof.

13. A method for producing antibodies which specifically bind to CS193
antigen, said method comprising administering to an individual an isolated
immunogenic polypeptide or fragment thereof in an amount sufficient to elicit an
immune response, wherein said immunogenic polypeptide comprises at least one
CS193 epitope and has at least 50% identity with a sequence selected from the group
consisting of SEQUENCE ID NOS 41-49, and fragments thereof.

14. A method for producing antibodies which specifically bind to CS193
antigen, said method comprising administering to an individual a plasmid comprising a
polynucleotide sequence which encodes at least one CS193 epitope derived from a
polypeptide having an amino acid sequence selected from the group consisting of
SEQUENCE ID NOS 41-49, and fragments thereof.

15. A composition of matter comprising a CS193 polynucleotide or
fragment thereof, wherein said polynucleotide has at least 50% identity with a
polynucleotide selected from the group consisting of SEQUENCE ID NOS 1-18, and
fragments or complements thereof.

16. A composition of matter comprising a polypeptide containing at least one
CS193 epitope, wherein said polypeptide has at least 60% identity with a sequence
selected from the group consisting of SEQUENCE ID NOS 41-49, and fragments
thereof.

17. A gene, or a fragment thereof, which codes for a CS193 protein
comprising an amino acid sequence that has at least 60% identity with SEQUENCE ID
NO 41.



18. A gene or fragment thereof comprising DNA having at least 50% 
identity with SEQUENCE ID NO 16, SEQUENCE ID NO 17, or SEQUENCE ID NO 
18.

REAGENTS AND METHODS USEFUL FOR DETECTING
DISEASES OF THE GASTROINTESTINAL TRACT



Abstract of the Disclosure 

A set of contiguous and partially overlapping cDNA sequences and polypeptides
encoded thereby, designated as CS193 and transcribed from GI tract tissue, are
described. These sequences are useful for detecting, diagnosing, staging,
monitoring, prognosticating, preventing or treating, or determining the predisposition
of an individual to diseases and conditions of the GI tract, such as GI tract cancer. Also
provided are antibodies which specifically bind to CS193-encoded polypeptide or
protein, and agonists or inhibitors which prevent action of the tissue-specific CS193
polypeptide, which molecules are useful for the therapeutic treatment of GI tract
diseases, tumors or metastases.




Figure 1-A

>2767646   GCCAGGAATA ACTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT
>774134                CTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT
>774134IH              CTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT
Consensus  GCCAGGAATA ACTAGAGAGG AACAATGGGG TTATTCAGAG GTTTTGTTTT

>2767646   CCTCTTAGTT CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC
>774134    CCTCTTAGTT CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC
>774134IH  CCTCTTAGTT CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC
Consensus  CCTCTTAGTT CTGTGCCTGC TGCACCAGTC AAATACTTCC TTCATTAAGC

>2767646   TGAATAATAA TGGCTTTGAA GATATTGTCA TTGTTATAGA TCCTAGTGTG
>774134    TGAATAATAA TGGCTTTGAA GATATTGTCA TTGTTATAGA TCCTAGTGTG
>774134IH  TGAATAATAA TGGCTTTGAA GATATTGTCA TTGTTATAGA TCCTAGTGTG
Consensus  TGAATAATAA TGGCTTTGAA GATATTGTCA TTGTTATAGA TCCTAGTGTG

>2767646   CCACAAGATG AAAAAATAAT TGAACAAATA GAGGATATGG TGACTACAGC
>774134    CCAGAAGATG AAAAAATAAT TGAACAAATA GAGGATATGG TGACTACAGC
>774134IH  CCAGAAGATG AAAAAATAAT TGAACAAATA GAGGATATGG TGACTACAGC
Consensus  CCAGAAGATG AAAAAATAAT TGAACAAATA GAGGATATGG TGACTACAGC

>2767646   TTCTACGTAC CTGTTTGAAG CCACAGAAAA AAGATTTTTT T
>774134    TTCTACGTAC CTGTTTGAAG CCACAGAAAA
>774134IH  TTCTACGTAC CTGTTTGAAG CCACAGAAAA AAGATTTTTT TTCAAAAATG
Consensus  TTCTACGTAC CTGTTTGAAG CCACAGAAAA AAGATTTTTT TTCAAAAATG

>774134IH  TATCTATATT AATTCCTGAG AATTGGAAGG AAAATCCTCA GTACAAAAGG
Consensus  TATCTATATT AATTCCTGAG AATTGGAAGG AAAATCCTCA GTACAAAAGG

>774134IH  CCAAAACATG AAAACCATAA ACATGCTGAT GTTATAGTTG CACCACCTAC
Consensus  CCAAAACATG AAAACCATAA ACATGCTGAT GTTATAGTTG CACCACCTAC

>774134IH  ACTCCCAGGT AGAGATGAAC CATACACCAA GCAGTTCACA GAATGTGGAG
Consensus  ACTCCCAGGT AGAGATGAAC CATACACCAA GCAGTTCACA GAATGTGGAG

>774134IH  AGAAAGGCGA ATACATTCAC TTCACCCCTG ACCTTCTACT TGAAAAAAAA
Consensus  AGAAAGGCGA ATACATTCAC TTCACCCCTG ACCTTCTACT TGAAAAAAAA

>774134IH  CAAAATGAAT ATGGACCACC AGGCAAACTG TTTGTCCATG AGTGGGCTCA
Consensus  CAAAATGAAT ATGGACCACC AGGCAAACTG TTTGTCCATG AGTGGGCTCA

>774134IH  CCTCCGGTGG GGAGTGTTTG ATGAGTACAA TGAAGATCAG CCTTTCTACC
Consensus  CCTCCGGTGG GGAGTGTTTG ATGAGTACAA TGAAGATCAG CCTTTCTACC

>774134IH  GTGCTAAGTC AAAAAAAATC GAAGCAACAA GGTGTTCCGC AGGTATCTCT
Consensus  GTGCTAAGTC AAAAAAAATC GAAGCAACAA GGTGTTCCGC AGGTATCTCT

>774134IH  GGTAGAAATA GAGTTTATAA GTGTCAAGGA GGCAGCTGTC TTAGTAGAGC
Consensus  GGTAGAAATA GAGTTTATAA GTGTCAAGGA GGCAGCTGTC TTAGTAGAGC

Figure 1-B



ATGCAGAATT GATTCTACAA CAAAACTGTA TCGAAAAGAT TGTCAATTCT 
ATGCAGAATT GATTCTACAA CAAAACTGTA TCGAAAAGAT TGTCAATTCT 

TTCCTGATAA AGTACAAACA GAAAAAGCAT CCATAATGTT TATGCAAAGT
TTCCTGATAA AGTACAAACA GAAAAAGCAT CCATAATGTT TATGCAAAGT

ATTGATTCTC TTGTTGAATT TTGTAACGAA AAAACCCATA ATCAAGAAGC 

TT NTGTAACGAA AAAACCCATA ATCAAGAAGC 
ATTGATTCTG TTGTTGAATT TTGTAACGAA AAAACCCATA ATCAAGAAGC 

TCCAAGCCTA CAAAACATAA AGTGCAATTT TAGAAGTACA TGGGAGGTGA 
TCCAAGCCTA CAAAACATAA AGTGCAATTT TAGAAGTACA TGGGAGGTGA 
TCCAAGCCTA CAAAACATAA AGTGCAATTT TAGAAGTACA TGGGAGGTGA 

TTAGCAATTC TGAGGATTTT AAAAACACCA TACCCATGGT GACACCACCT 
TTAGCAATTC TGAGGATTTT AAAAACACCA TACCCATGGT GACACCACCT 
TTAGCAATTC TGAGGATTTT AAAAACACCA TACCCATGGT GACACCACCT 

CCTCCACCTG TCTTCTCATT GCTGAAGATC AGTCAAAGAA TTGTGTGCTT 
CCTCCACCTG TCTTCTCATT GCTGAAGATC AGTCAAAGAA TTGTGTGCTT 
CCTCCACCTG TCTTCTCATT GCTGAAGATC AGTCAAAGAA TTGTGTGCTT 

AGTTCTTGAT AAGTCTGGAA GCATGGGGGG TAAGGACCGC CTAAATCGAA 
AGTTCTTGAT AAGTCTGGAA GCATGGGGGG TAAGGACCGC CTAAATCGA 

TGGGGGG TAAGGACCGC CTAAATCGAA 
AGTTCTTGAT AAGTCTGGAA GCATGGGGGG TAAGGACCGC CTAAATCGAA 

TGAATCAAGC AGCAAAACAT TTCCTGCTGC AGACTGTTGA AAATGGATCC 
TGAATCAAGC AGCAAAACAT TTCCTGCTGC AGACTGTTGA AAATGGATCC 
TGAATCAAGC AGCAAAACAT TTCCTGCTGC AGACTGTTGA AAATGGATCC 

TGGGTGGGGA TGGTTCACTT TGATAGTACT GCCACTATTG TAAATAAGCT
TGGGTGGGGA TGGTTCACTT TGATAGTACT GCCACTATTG TAAATAAGCT 
TGGGTGGGGA TGGTTCACTT TGATAGTACT GCCACTATTG TAAATAAGCT 

AATCCAAATA AAAAGCAGTG ATGAAAGAAA CACACTCATG GCAGGATTAC
AATCCAAATA AAAAGCAGTG ATGAAAGAAA CACACTCATG GCAGGATTAC
AATCCAAATA AAAAGCAGTG ATGAAAGAAA CACACTCATG GCAGGATTAC

CTACATATCC TCTGGGAGGA ACTTCCATCT CCTCTGGAAT TAAATATGCA
CTACATATCC TCTGGGAGGA ACTTCCATCT GCTCTCGAAT TAAATATGCA
* CTTCCATCT GCTCTGGAAT TAAATATGCA 

CTACATATCC TCTGGGAGGA ACTTCCATCT GCTCTGGAAT TAAATATGCA 

TTTCAGGTGA TTGGAGAGCT ACATTCCCAA CTCGATGGAT CCGAAGTACT 
TTTCAGGTGA 

TTTCAGGTGA TTGGAGAGCT ACATTCCCAA CTCGATGGAT CCGAAGTACT 
TTTCAGGTGA TTGGAGAGCT ACATTCCCAA CTCGATGGAT CCGAAGTACT 



Figure 1-C

GCTGCTGACT GATGGGGAGG ATAACACTGC AAGTTCTTGT ATTGATGAAG
GCTGCTGACT GATGGGGAGG ATAACACTGC AAGTTCTTGT ATTGATGAAG 
GCTGCTGACT GATGGGGAGG ATAACACTGC AAGTTCTTGT ATTGATGAAG 

TGAAACAAAG TGGGGCCATT GTTCATTTTA TTGCTTTGGG AAGAGCTGCT 
TGAAACAAAG TGGGGCCATT GTTCATTTTA TTGCTTTGGG AAGAGCTGCT 
TGAAACAAAG TGGGGCCATT GTTCATTTTA TTGCTTTGGG AAGAGCTGCT 

* 

GATGAAGCAG TAATAGAGAT GAGCAAGATA ACAGGAGGAA GTCATTTTTA 
GATGAAGCAG TAATAGAGAT GAGCAAGATA ACAGGAG 

GATGAAGCAG TAATAGAGAT GAGCAAGATA ACAGGAGGAA GTCATTTTTA 

TGTTTCAGAT GAAGCTCAGA ACAATGGCCT CATTGATGCT TTTGGGGCTC 
TGTTTCAGAT GAAGCTCAGA ACAATGGCCT CATTGATGCT TTTGGGGCTC 

TTACATCAGG AAATACTGAT CTCTCCCAGA AGTCCCTTCA GCTCGAAAGT 
TTACATCAGG AAATACTGAT CTCTCCCAGA AGTCCCTTCA GCTCGAAAGT 

AAGGGATTAA CACTGAATAG TAATGCCTGG ATGAACGACA CTGTCATAAT 

AAT 

AAGGGATTAA CACTGAATAG TAATGCCTGG ATGAACGACA CTGTCATAAT 

TGATAGTACA GTGGGAAAGG ACACGTTCTT TCTCATCACA TGGAACAGTC 
TGATAGTACA GTGGGAAAGG NCACGTTCTT TCTCATCACA TGGAACAGTC 
TGATAGTACA GTGGGAAAGG ACACGTTCTT TCTCATCACA TGGAACAGTC 



>774134IH TGCCTCCCAG TATTTCTCTC TGGGATCCCA GTGGAACAAT AATGGAAAAT 

>1286372 TGCCTCCCAG TATTTCTCTC TGGGATCCCA GTGGAACAAT AATGGAAAAT 

Consensus TGCCTCCCAG TATTTCTCTC TGGGATCCCA GTGGAACAAT AATGGAAAAT 

>774134IH TTCACAGTGG ATGCAACTTC CAAAATGGCC TATCTCAGTA TTCCAGGAAC 

>1286372 TTCACAGTGG ATGCAACTTC CAAAATGGCC TATCTCAGTA TTCCAGGAAC 

Consensus TTCACAGTGG ATGCAACTTC CAAAATGGCC TATCTCAGTA TTCCAGGAAC

TGCAAAGGTG GGCACTTGGG CATACAATCT TCAAGCCAAA GCGAACCCAG
TGCAAAGGTG GGCACTTGGG CATACAATCT TCAAGCCAAA GCGAACCC 
TGCAAAGGTG GGCACTTGGG CATACAATCT TCAAGCCAAA GCGAACCCAG 

AAACATTAAC TATTACAGTA ACTTCTCGAG CAGCAAATTC TTCTGTGCCT 

GCAAATTC TTCTGTGCCT 
GCAAATTC TTCTGTGCCT 

AAACATTAAC TATTACAGTA ACTTCTCGAG CAGCAAATTC TTCTGTGCCT 

CCAATCACAG TGAATGCTAA AATGAATAAG GACGTAAACA GTTTCCCCAG 
CCAATCACAG TGAATGCTAA AATGAATAAG GACGTAAACA GTTTCCCCAG 
CCAATCACAG TGAATGCTAA AATGAATAAG GACGTAAACA GTTTCCCCAG 
CCAATCACAG TGAATGCTAA AATGAATAAG GACGTAAACA GTTTCCCCAG 



Figure 1-D 



>774134IH CCCAATGATT GTTTACGCAG AAATTCTACA AGGATATGTA CCTGTTCTTG 

>774419 CCCAATGATT GTTTACGCAG AAATTCTACA AGGATATGTA CCTGTTCTTG 

>774419IH CCCAATGATT GTTTACGCAG AAATTCTACA AGGATATGTA CCTGTTCTTG 

Consensus CCCAATGATT GTTTACGCAG AAATTCTACA AGGATATGTA CCTGTTCTTG 

>774134IH GAGCCAATGT GACTGCTTTC ATTGAATCAC AGAATGGACA TACAGAAGTT

>774419 GAGCCAATGT GACTGCTTTC ATTGAATCAC AGAATGGACA TACAGAAGTT 

>774419IH GAGCCAATGT GACTGCTTTC ATTGAATCAC AGAATGGACA TACAGAAGTT 

Consensus GAGCCAATGT GACTGCTTTC ATTGAATCAC AGAATGGACA TACAGAAGTT 

>774134IH TTGGAACTTT TGGATAATGG TGCAGGCGCT GATTCTTTCA AGAATGATGG

>774419 TTGGAACTTT TGGATAATGG TGCAGGCGCT GATTCTTTCA AGAATGATGG

>774419IH TTGGAACTTT TGGATAATGG TGCAGGCGCT GATTCTTTCA AGAATGATGG
>3233118 G TGCAGGCGCT GATTCTTTCA AGAATGATGG

Consensus TTGGAACTTT TGGATAATGG TGCAGGCGCT GATTCTTTCA AGAATGATGG

>774134IH AGTCTACTCC AGGTATTTTA CAGCATATAC AGAAAATGGC AGATATAGCT 

>774419 AGTCTACTCC AGGTATTTTA CAG 

>774419IH AGTCTACTCC AGGTATTTTA CAGCATATAC AGAAAATGGC AGATATAGCT

>3233118 AGTCTACTCC AGGTATTTTA CAGCATATAC AGAAAATGGC AGATATAGCT 

Consensus AGTCTACTCC AGGTATTTTA CAGCATATAC AGAAAATGGC AGATATAGCT 

>774134IH TAAAAGTTCG GGCTCATGGA GGAGCAAACA CTGCCAGGCT AAAATTACGG 

>774419IH TAAAAGTTCG GGCTCATGGA GGAGCAAACA CTGCCAGGCT AAAATTACGG

>3233118 TAAAAGTTCG GGCTCATGGA GGAGCAAACA CTGCCAGGCT AAAATTACGG

Consensus TAAAAGTTCG GGCTCATGGA GGAGCAAACA CTGCCAGGCT AAAATTACGG 

>774134IH CCTCCACTGA ATAGAGCCGC GTACATACCA GGCTGGGTAG TGAACGGGGA 

>774419IH CCTCCACTGA ATAGAGCCGC GTACATACCA GGCTGGGTAG TGAACGGGGA

>3233118 CCTCCACTGA ATAGAGCCGC GTACATACCA GGCTGGGTAG TGAACGGGGA

Consensus CCTCCACTGA ATAGAGCCGC GTACATACCA GGCTGGGTAG TGAACGGGGA

AATTGAAGCA AACCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA
AATTGAAGCA AACCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA 
AATTGAAGCA AACCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA 
CCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA 
AATTGAAGCA AACCCGCCAA GACCTGAAAT TGATGAGGAT ACTCAGACCA 



>774134IH CCTTGGAGGA TTTCAGCCGA ACAGCATCCG GAGGTGCATT TGTGGTATCA

>774419IH CCTTGGAGGA TTTCAGCCGA ACAGCATCCG GAGGTGCATT TGTGGTATCA 

>3233118 CCTTGGAGGA T

>2733923 CCTTGGAGGA TTTCAGCCGA ACAGCATCCG GAGGTGCATT TGTGGTATCA 

Consensus CCTTGGAGGA TTTCAGCCGA ACAGCATCCG GAGGTGCATT TGTGGTATCA

CAAGTCCCAA GCCTTCCCTT GCCTGACCAA TACCCACCAA GTCAAATCAC
CAAGTCCCAA GCCTTCCCTT GCCTGACCAA TACCCACCAA GTCAAATCAC 
CAAGTCCCAA GCCTTCCCTT GCCTGACCAA TACCCACCAA GTCAAATCAC 

CCAA TACCCACCAA GTCAAATNAC 
CAAGTCCCAA GCCTTCCCTT GCCTGACCAA TACCCACCAA GTCAAATCAC 



Figure 1-E

CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT GATGACATTG
CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT GATGACATTG 
CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT GATGACATTG 
CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT GATGACATTG 
CATTGCACAA GTAACTTTGT TTATCCCTCA AGCAAATCCT GATGACATTG 

ATCCTACTCC TACTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA 
ATCCTACTCC TACTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA 
ATCCTACTCC TACTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA 
CTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA 
ATCCTACTCC TACTCCTACT CCTACTCCTG ATAAAAGTCA TAATTCTGGA

Figure 1-F



GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGGTCTG TTGTAATTGT 
GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGGTCTG TTGTAATTGT 
GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGG 

GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGGTCTG TTGTAATTGT 

TCTG TTGTAATTGT 

GTTAATATTT CTACGCTGGT ATTGTCTGTG ATTGGGTCTG TTGTAATTGT 

TAACTTTATT TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT 
TAACTTTATT TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT 
TAACTTTATT TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT 
TAACTTTATT TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT 

G AAAAAAATCT 

TAACTTTATT TTAAGTACCA CCATTTGAAC CTTAACGAAG AAAAAAATCT 
TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTAAAG 

TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTAAAG

TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTAAAG 
TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTNAAG 
TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTAAAG 
TCAAGTAGAC CTAGAAGAGA GTTTTAAAAA ACAAAACAAT GTAAGTAAAG 

GATATTTCTG AATCTTAAAA TTCATCCCAT CTGTGATCAT AAACTCATAA 
GATATTTCTG AATCTTAAAA TTCATCCCAT CTGTGATCAT AAACTCATAA 

GATATTTCTG AAT
GATATTTCTG AATCTTAAAA TTCATCCCAT CTGTGATCAT AAACTCATAA 
GATATTTCTG AATCTTAAAA TTCATCCCAT CTGTGATCAT AAACTCATAA 
GATATTTCTG AATCTTAAAA TTCATCCCAT CTGTGATCAT AAACTCATAA 

AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA TAAAAACACT
AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA TAAAAACACT 

AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA TAAAAACACT
AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA TAAAAACACT
AAATAATTTT AAGATGTCGG AAAAGGATAC TTTGATTAAA TAAAAACACT 

CATGGATATG TAAAAACTGT CAAGATTAAA ATTTAATAGT TTCATTTATT
CATGGATATG TAAAAACTGT CAAGATTAAA ATTTAATAGT TTCATTTATT
CATGGATATG TAAAAACTGT CAAGATTAAN ATTTAATAGT TTCATTTATT 
CATGGATATG TAAAAACTGT CAAGATTAAA ATTTAATAGT TTCATTTATT 
CATGGATATG TAAAAACTGT CAAGATTAAA ATTTAATAGT TTCATTTATT 

TGTTATTTTA TTTGTAAGAA ATAGTGATGA ACAAAGATCC TTTTTCATAC
TGTTATTTTA TTTGTAAGAA ATAGTGATGA ACAAAGATCC TTTTTCATAC 
TGTTATTTTA TTTGTAAGAN ATAGTGATGA ACAAAGA 
TGTTATTTTA TTTGTAAG 

TGTTATTTTA TTTGTAAGAA ATAGTGATGA ACAAAGATCC TTTTTCATAC 
TGAT 

TGATACCTGG TTGTATATTA TTTGATGCAA CAGTTTTCTG AAATGATATT 
TGATACCTGG TTGTATATTA TTTGATGCAA CAGTTTTCTG AAATGATATT 

TCAAATTGCA TCAAGAAATT AAAATCATCT ATCTGAGTAG TCAAAATACA 
TCAAATTGCA TCAAGAAATT AAAATCATCT ATCTGAGTAG TCAAAATACA